In the context of growing networks, we introduce a simple dynamical model
that unifies the generic features of real networks: scale-free distribution
of degree and the small world effect. While the average shortest path length
increases logarithmically as in random networks, the clustering coefficient
assumes a large value independent of system size. We derive expressions for
the clustering coefficient in two limiting cases: random
($C \sim (\ln N)^2 / N$) and highly clustered ($C = 5/6$) scale-free
networks.
PACS: 87.23.Ge, 89.75.Hc, 89.65.-s

Many systems can be represented by networks, i.e. as a set of nodes joined
together by links indicating interaction. Social networks, the Internet,
food webs, distribution networks, metabolic and protein networks, the
networks of airline routes, scientific collaboration networks and citation
networks are just some examples of such systems [1-11]. Most of these
networks share three prominent features: (A) The average shortest path
length $L$ is small. In order to connect two nodes on the graph, typically
only a few edges need to be passed. (B) The clustering coefficient $C$ is
large. Two nodes having a common neighbor are far more likely to be
connected to each other than are two nodes picked at random. (C) The
distribution of the degree is scale-free, i.e. it decays as a power law.
The absence of a typical scale for the connectivity of nodes is often
related to the organization of the network as a hierarchy.

In this Letter we present the first attempt to explain the empirical
observations by a model of network self-organization according to simple
rules. To the best of our knowledge, all previous approaches to modeling
complex networks have only partially taken into account the above
properties (A), (B) and (C). Co-occurrence of high clustering and short
distance between nodes was originally termed the "small world" phenomenon.
It can be obtained by departing from a regular lattice and randomly
rewiring links with a probability $p \ll 1$ [1]. However, networks created
in this way display a degree distribution sharply peaked around the mean
value; a power-law decay is not observed. Barabási and Albert have given a
first explanation of the scale-free distribution by reformulating Simon's
model [12,13] in the context of growing networks. New nodes join the
network by attaching $m$ links to other nodes, chosen according to linear
preferential attachment. This means that a node obtains one of the new
links with a probability proportional to the number of links it already
has. The algorithm, henceforth called BA model, generates networks with a
degree distribution $P(k) = 2m^2 k^{-3}$ with $k \geq m$. However, the
clustering coefficient approaches zero as the system size $N$ grows. The
value of the clustering coefficient predicted by the BA model is typically
several orders of magnitude lower than found empirically.

Recently an alternative algorithm has been suggested
[14] to account for the high clustering found in scale-free
networks. The topology of the networks produced is sim- 
ilar to one-dimensional regular lattices. The connectivity 
(coordination number), however, is not constant but fol- 
lows a power-law distribution causing the clustering to 
be even higher than in regular lattices. Here we general- 
ize the model to include long-range connections. We find 
that a small ratio of long-range connections is sufficient 
to obtain small path length, keeping the high clustering 
and scale-free degree distribution of the original model. 

Let us recall the high clustering model as originally defined in Ref. [14]:
Each node of the network is assigned a binary state variable. A newly
generated node is in the active state and keeps attaching links until
eventually deactivated. Taking a completely connected network of $m$ active
nodes as an initial condition, each step of the time-discrete dynamics
consists of the following three stages: (i) A new node joins the network by
attaching a link to each of the $m$ active nodes. (ii) The new node becomes
active. (iii) One of the active nodes is deactivated. The probability that
node $i$ is chosen for deactivation is $p_i = a k_i^{-1}$ with
normalization $a^{-1} = \sum_j k_j^{-1}$, the sum running over the active
nodes. The model generates networks with degree distribution
$P(k) = 2m^2 k^{-3}$ ($k \geq m$) and average connectivity
$\langle k \rangle = 2m$ [14]. Regarding topological properties the
networks are reminiscent of one-dimensional regular lattices. The path
length increases linearly with system size whereas the clustering
coefficient quickly converges to a constant value.
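The deactivation rule of stage (iii) can be sketched in a few lines of
Python. This is a minimal illustration, assuming the degrees of the
currently active nodes are available in a dictionary; the function names
`deactivation_probabilities` and `choose_node_to_deactivate` are
hypothetical and not taken from Ref. [14].

```python
import random


def deactivation_probabilities(degrees):
    """Return p_i = a / k_i for the given degrees, with 1/a = sum_j 1/k_j."""
    inverse = [1.0 / k for k in degrees]
    a = 1.0 / sum(inverse)  # normalization constant
    return [a * x for x in inverse]


def choose_node_to_deactivate(active, degree, rng=random):
    """Pick one active node; low-degree nodes are more likely to be chosen.

    `active` is a list of node labels and `degree` maps label -> degree
    (both are assumptions about how the state is stored).
    """
    probs = deactivation_probabilities([degree[i] for i in active])
    return rng.choices(active, weights=probs)[0]
```

For example, three active nodes with degrees 2, 4 and 4 are deactivated
with probabilities 0.5, 0.25 and 0.25, so well-connected nodes tend to stay
active and keep collecting links.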

Long-range connections are introduced into the model by modifying stage (i)
in the dynamical rules as follows. For each of the $m$ links of the new
node it is decided randomly whether the link connects to the active node
(as in the original model) or it connects to a random node. The latter case
occurs with a probability $\mu$. In this case the random node is chosen
according to linear preferential attachment, i.e. the probability that node
$j$ obtains a link is proportional to the node's degree $k_j$. For
$\mu = 0$ we recover the high clustering model. The case $\mu = 1$ is the
BA model. Varying $\mu$ in the interval $[0, 1]$ allows us to study the
cross-over between the two models. We are especially interested in the
behaviour of the topological properties, namely the average shortest path
length and the clustering coefficient, as a function of the cross-over
parameter $\mu$. Figure 1 shows the variation of the average shortest path
length and the clustering coefficient with the parameter $\mu$. When
increasing $\mu$ from zero to small finite values, the average shortest
path length $L$ drops rapidly and approaches the low value of the BA model.
The clustering coefficient $C$ remains practically constant in this same
range $0 < \mu \ll 1$. We have checked that the power-law distribution of
the degree (not shown here) is still obtained in this range. Thus the model
with $0 < \mu \ll 1$ reproduces the three generic properties (A), (B) and
(C) of real-world networks. The model is robust against changes in the rule
for the introduction of random links. The small world transition shown in
Fig. 1 does not change significantly when the attachment is not
preferential, i.e. every node receives a random link with the same
probability.
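The growth rules with long-range connections can be sketched as follows.
This is a minimal pure-Python illustration under stated assumptions, not
the authors' implementation: the names `grow_network` and
`local_clustering` are hypothetical, duplicate targets are resolved by
drawing further preferential targets until the new node has $m$ distinct
neighbors (the text does not specify this detail), and $m \geq 2$ is
required so that the initial nodes have nonzero degree.

```python
import random


def grow_network(n_nodes, m, mu, seed=None):
    """Grow a network and return it as a dict: node -> set of neighbors."""
    if m < 2:
        raise ValueError("this sketch assumes m >= 2")
    rng = random.Random(seed)
    # Initial condition: m active nodes, completely connected.
    adj = {i: set(range(m)) - {i} for i in range(m)}
    active = list(range(m))
    # One entry per link end; a uniform draw from this list is a
    # degree-proportional (linear preferential) choice of node.
    ends = [i for i in range(m) for _ in range(m - 1)]
    for new in range(m, n_nodes):
        # Stage (i): each link keeps its active node with probability 1 - mu.
        targets = {a for a in active if rng.random() >= mu}
        # The remaining links go to preferentially chosen nodes
        # (assumption: draw again until m distinct targets are found).
        while len(targets) < m:
            targets.add(rng.choice(ends))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            ends.extend((t, new))
        # Stage (ii): the new node becomes active.
        active.append(new)
        # Stage (iii): deactivate one active node with probability ~ 1/k_i.
        weights = [1.0 / len(adj[i]) for i in active]
        active.pop(rng.choices(range(len(active)), weights=weights)[0])
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))
```

Calling `grow_network(n_nodes, m, 0.0)` gives the high clustering model and
`grow_network(n_nodes, m, 1.0)` draws all links preferentially; averaging
`local_clustering` over all nodes yields an estimate of $C$.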
FIG. 1. Small world effect in scale-free networks. Introducing a ratio
$\mu \ll 1$ of random links into the highly clustered scale-free networks
drastically reduces the typical distance $L$ between nodes. However, the
strongly interconnected neighborhoods of the original model ($\mu = 0$) are
preserved, as the clustering coefficient remains at its large value. Only
when $\mu$ reaches the order of 1 does the clustering coefficient drop
significantly. All plotted values are averages over 100 independent
realizations. The networks have size $N$ and average degree
$\langle k \rangle = 20$.
The observed drop in the average shortest path length, $L$, is due to a
qualitative change in the dependence of $L$ on the system size. In Fig. 2
we show $L$ as a function of the system size $N$ for $\mu = 0$ and
$\mu = 0.1$. For $\mu = 0$, the average shortest path length grows
linearly, $L \propto N$, the same behavior observed in one-dimensional
regular lattices. In clear contrast, a logarithmic growth of $L$ is
obtained for $\mu = 0.1$, $L \propto \ln N$. The logarithmic increase of
$L$ with system size is typical of the small world effect.
FIG. 2. Average shortest path length $L$ as a function of system size $N$.
In networks without long-range connections ($\mu = 0$) the relation between
$L$ and $N$ is linear. This is seen best in the inset with linear scales on
both axes. When attaching a fraction $\mu = 0.1$ of all links to random
nodes instead of the currently active ones, $L$ grows merely
logarithmically with $N$. The values can be fit well by a straight line in
the plot with logarithmic $N$ scale (main panel). All values plotted are
averages over 100 independent realizations. The average degree is
$\langle k \rangle = 20$.
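For reference, the average shortest path length $L$ of a given network can
be measured by a breadth-first search from every node. The following is a
minimal sketch, assuming the network is stored as a dictionary mapping each
node to the set of its neighbors; the function name
`average_shortest_path_length` is chosen here for illustration only.

```python
from collections import deque


def average_shortest_path_length(adj):
    """Mean distance over all ordered pairs of distinct, connected nodes."""
    total = 0
    pairs = 0
    for source in adj:
        # Breadth-first search: in an unweighted graph the first visit of a
        # node happens along a shortest path.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


# Usage: a ring of 6 nodes, the simplest one-dimensional regular lattice.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(average_shortest_path_length(ring))
```

On a ring with an even number $n$ of nodes the result is $n^2 / (4(n-1))$,
i.e. the linear growth with system size that is characteristic of
one-dimensional lattices.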

In the remainder of the Letter we study the evolution of the clustering
coefficient $C$ as a function of network size $N$. We begin by deriving $C$
analytically for the two limiting cases $\mu = 0$ (the high clustering
model) and $\mu = 1$ (the BA model).

Consider first the case $\mu = 0$. At any given time step the set of active
nodes is completely interconnected, simply because a newly generated node
always connects to all active nodes before being activated itself. It
follows that a node $l$ with degree $k_l = m$ has $C_l = 1$ because all the
$m(m-1)/2$ possible links between neighbors of $l$ actually exist. If $l$
is deactivated in the time step of its generation, its neighborhood does
not change any more and it keeps $C_l = 1$. Otherwise a node $i \neq l$ is
deactivated. In the next time step the node $l + 1$ connects to $l$ and all
its neighbors apart from node $i$. Then $k_l(k_l - 1)/2 - 1$ of the
possible $k_l(k_l - 1)/2$ links between neighbors of $l$ exist, where now
$k_l = m + 1$. If node $l$ remains active, a node $j \neq l$ is
deactivated. Node $l + 2$ connects to all neighbors of $l$ apart from $i$
and $j$, causing another 2 links to be missing in the neighborhood of $l$.
See Fig. 3 for an illustration. By induction it follows that after $n$
iterations $\sum_{\nu=1}^{n} \nu = n(n+1)/2$ links are missing in the
neighborhood of $l$.

Thus the clustering $C_l$ depends only on the degree $k_l$. The exact
relation is

$$C(k) = 1 - \frac{(k - m + 1)(k - m)}{k(k - 1)} . \qquad (1)$$



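The clustering coefficient $C$ can be obtained as the mean value of $C(k)$
with respect to the degree distribution $P(k) = 2m^2 k^{-3}$, $k \geq m$.
The result is

$$C = \int_m^\infty \left[ 1 - \frac{(k - m + 1)(k - m)}{k(k - 1)} \right] 2m^2 k^{-3} \, dk \qquad (2)$$

$$\phantom{C} = \frac{5}{6} - \frac{7}{30m} + \mathcal{O}(m^{-2}) . \qquad (3)$$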


In the limit of large $m$ the clustering coefficient is 5/6. It is worth
noting that this value is higher than for regular lattices. The value
$5/6 \approx 0.83$ is similar to the one obtained in the film actor network
(0.79), the coauthorship network in neuroscience (0.76), and networks of
word synonyms (0.7) [15].
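The expansion in Eq. (3) can be checked numerically. The sketch below
evaluates the integral of Eq. (2) with a midpoint rule after the
substitution $u = m/k$, which turns the weight $2m^2 k^{-3}\,dk$ into
$2u\,du$ on the interval $(0, 1]$. The function names and the number of
integration steps are choices made here for illustration; $m \geq 2$ is
assumed so that $C(k)$ is defined at $k = m$.

```python
def clustering_of_degree(k, m):
    """Eq. (1): local clustering of a node with degree k >= m."""
    return 1.0 - (k - m + 1) * (k - m) / (k * (k - 1))


def mean_clustering(m, steps=100_000):
    """Eq. (2) evaluated numerically with the midpoint rule in u = m / k."""
    h = 1.0 / steps
    total = 0.0
    for s in range(steps):
        u = (s + 0.5) * h  # midpoint of the s-th subinterval of (0, 1]
        total += clustering_of_degree(m / u, m) * 2.0 * u * h
    return total


for m in (2, 10, 100):
    # Compare the numerical integral with the first two terms of Eq. (3).
    print(m, mean_clustering(m), 5.0 / 6.0 - 7.0 / (30.0 * m))
```

The two columns should approach each other as $m$ grows, with a difference
that shrinks like $m^{-2}$.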
FIG. 3. Illustrating the calculation of the clustering coefficient of the
highly clustered model ($\mu = 0$, $m = 2$). The encircled node is the node
$l$ under consideration. Links of this node are drawn as thick lines, links
between its neighbors are thin lines. The dotted lines are links that are
"missing" in the neighborhood of $l$. Active nodes are filled circles,
inactive nodes are unfilled. See text for further explanation.

Let us now consider the BA model ($\mu = 1$). When adding node $j$ to the
network, the probability for one link of node $j$ to connect with node $i$
is the ratio of the degree of the node $i$, $k_i$, and the sum of all
nodes' degrees in the network, $2mj$. Thus the probability for the
existence of a link from $j$ to $i$ is given by

$$\Pr\{(i,j)\} = m \, \frac{k_i(j)}{2mj} , \qquad (4)$$

where the prefactor $m$ takes into account that $m$ links per node are
added to the network. By $k_i(j)$ we denote the degree of node $i$ at the
time that node $j$ is added. Neglecting small fluctuations, the degree of
the $i$-th node is $k_i(j) = m (j/i)^{0.5}$ according to Ref. [11].
Inserting into Eq. (4) gives

$$\Pr\{(i,j)\} = \frac{m}{2} (ij)^{-0.5} . \qquad (5)$$



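The local clustering $C_l(N)$ of the node $l$ in a network of size $N$ is
defined as the number of links between neighbors of $l$, divided by the
total number of pairs of neighbors $l$ has. Only taking into account
expectation values and treating the nodes as a continuum, we find

$$C_l(N) = \frac{\int_1^N di \int_1^N dj \, \Pr\{(i,j)\} \Pr\{(i,l)\} \Pr\{(j,l)\}}{k_l^2(N)} , \qquad (6)$$

where we have approximated the total number of pairs of neighbors by
$k_l^2/2$; the factor $1/2$ cancels because the double integral counts
every pair $(i,j)$ twice. Evaluating the probabilities according to Eq. (5)
and using $k_l^2(N) = m^2 N / l$ yields

$$C_l(N) = \frac{m^3}{8 k_l^2(N)} \int_1^N di \int_1^N dj \, (ij)^{-0.5} (il)^{-0.5} (jl)^{-0.5} \qquad (7)$$

$$\phantom{C_l(N)} = \frac{m^3 (\ln N)^2}{8 \, l \, k_l^2(N)} \qquad (8)$$

$$\phantom{C_l(N)} = \frac{m (\ln N)^2}{8N} . \qquad (9)$$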

The average value of the local clustering $C_l$ does not depend on the node
$l$ under consideration. The networks generated by the BA model show
homogeneous clustering, despite the inhomogeneous scale-free connectivity.
With increasing network size $N$, the clustering coefficient decreases as
$N^{-1}$ in leading order. The difference with respect to a random graph,
having a Poisson distribution of degree, is seen only in the logarithmic
correction $(\ln N)^2$.
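To get a feeling for the size of the logarithmic correction, Eq. (9) can be
compared with the clustering of a random graph of the same average degree.
The sketch below assumes the standard estimate
$C \approx \langle k \rangle / N$ for a random graph, which is not derived
in this Letter, and uses $\langle k \rangle = 2m$; the function names are
hypothetical.

```python
import math


def ba_clustering(m, n):
    """Eq. (9): C = m (ln N)^2 / (8 N) for the BA model."""
    return m * math.log(n) ** 2 / (8.0 * n)


def random_graph_clustering(mean_degree, n):
    """Assumed standard estimate C = <k> / N for a random graph."""
    return mean_degree / n


m = 10
for n in (10**3, 10**4, 10**5):
    ratio = ba_clustering(m, n) / random_graph_clustering(2 * m, n)
    # Under these assumptions the ratio equals (ln N)^2 / 16 for any m.
    print(n, ba_clustering(m, n), ratio)
```

Under these assumptions the ratio grows only as $(\ln N)^2 / 16$, so both
values vanish in the limit of large $N$.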

Figure 4, upper panel, shows the clustering coefficient obtained from
numerical simulations. For $\mu = 0$ we find an asymptotic value of
approximately 0.83 as predicted analytically. For $\mu = 0.1$, convergence
to a finite value is also observed. The BA model ($\mu = 1.0$) displays a
rapid decay of $C$ as the network size $N$ grows. The behavior of $C(N)$ in
the BA model is analyzed in the lower panel of Fig. 4, clearly supporting
the expression in Eq. (9). $C(N)$ is found to be inversely proportional to
the system size, with logarithmic corrections. A pure power law with
exponent $-0.75$ as proposed in Ref. [16] describes the numerical data less
accurately.
FIG. 4. Upper panel: The clustering coefficient $C$ as a function of
network size. Networks generated with $\mu = 0.0$ quickly reach the large
value predicted by the analytical calculations ($C \approx 0.83$). With 10%
long-range connections ($\mu = 0.1$) the clustering is lower but still
approaches an asymptotic value clearly above zero. In the BA model
($\mu = 1.0$) the clustering coefficient decreases drastically with growing
system size. Each of the three data sets is an average over 100 independent
simulation runs. Lower panel: For the BA model, the function
$(C(N) \, N)^{0.5}$ grows as $\ln N$, giving a straight line in a
logarithmic-linear plot. This indicates very good agreement with the
analytical result $C(N) \propto N^{-1} (\ln N)^2$. For comparison, the
theoretical curve $C(N) \propto N^{-0.75}$ is shown, as suggested in
Ref. [16].
In summary, we have defined and analyzed a model of self-organizing
networks with high clustering, small path length and a scale-free
distribution of degree. The networks with these generic properties are
obtained as a cross-over between highly clustered scale-free networks [14]
and scale-free random graphs [11]. The dependence of the topology on the
cross-over parameter is very similar to the small world transition observed
when introducing random links into a regular grid [1]. Therefore our
studies make a connection between small world graphs and scale-free
networks, essentially unifying both concepts in one model.



